






On the Relative Value of Averages derived from different numbers of 
Observations. By William A. Guy, M.B., Cantab, Fellow of the 
Royal College of Physicians; Professor of Forensic Medicine, King's 
College; Physician to King's College Hospital; Honorary Secretary 
of the Statistical Society. 

[Read before the Statistical Society of London, 16th April, 1849.] 

The subject of this communication has been usually treated as a branch
of the mathematics; and few, if any, attempts have been made to 
illustrate it by means of observation. The labour which this mode of 
illustration must necessarily entail has probably disinclined men from 
its adoption. On the other hand, abstract reasoning possesses the 
twofold advantage of brevity and certainty. The mathematician sees 
at once that any attempt to establish broad principles, or to construct 
formulae, by the aid of observation alone, must necessarily fail, from the
vast number of facts which would be required before even a starting- 
point for calculation could be reached. To take a familiar example: — 
the chances of throwing doublets with the dice may be very easily 
calculated; but days, weeks, or even months, might be spent in a vain 
attempt to deduce these chances from actual observation. So, also, 
with Vital Statistics. It would not be possible to accumulate a suffi- 
cient number of facts in illustration of the age at death of any single 
class of persons to enable us to determine how many of such facts 
would suffice to establish a true average. There is only one obvious 
method by which such a result could be accomplished; and that is, by 
placing, side by side, two averages based upon the same number
of facts, and adding an equal number of facts to either total, till the 
difference between the averages fell to zero, and continued to stand 
at that point through several successive additions. Even if we were 
to content ourselves with a rougher approximation, it could only be 
reached by a very laborious series of calculations, which might not, 
after all, repay the pains bestowed upon them. It is not, therefore, 
with any hope of solving the difficult question — How many observa- 
tions are necessary to obtain a true average? that the following facts 
are adduced ; but simply to furnish an illustration, imperfect though 
it be, of the variable results obtained by actual observation on a limited 
scale. The facts themselves, on which the averages are founded, 
happen to have been collected in such a manner as to fit them for the 
use to which they are now put; and having spent some time in the 
calculations, I am unwilling that the results should be lost. They 
may perhaps possess some immediate interest, and be put hereafter to 
some useful purpose. 

The facts to which I refer consist of the ages, at death, of the 
members of the several ranks and professions, and have already 
supplied the materials of a series of communications to the Statistical 
Society. Three collections of such facts are available for my present
purpose: the first, relating to the duration of life of the English 
aristocracy; the second, to that of kings; the third, to that of the large 
mixed class, whose deaths are recorded in the obituaries of the Annual
Register, consisting of almost every man of note in the higher and
middle ranks of society during about three quarters of a century. 

It may be well to premise that, in throwing these several classes 
of facts into groups, care has been taken to avoid everything bordering
on selection. Those facts which came first to hand are placed first, 
and those which followed in the order of collection take the second 
place in the tables. The same caution has been observed in adding 
group to group. The first use which I propose to make of the facts 
is to arrange them in averages of 25, 50, 75, 100, and so on up to 
1000, in two parallel columns, with a third column of differences, with 
a view of ascertaining the rate and degree of approximation of the two 
series. 

Table I.

                     Average Age at Death.
No. of Facts.   1st Series.   2nd Series.   Difference.
      25          60.96         61.24          0.28
      50          59.46         60.66          1.20
      75          61.30         61.20          0.10
     100          61.78         61.98          0.20
     125          61.10         61.42          0.32
     150          61.90         60.61          1.29
     175          62.44         60.67          1.77
     200          61.34         60.85          0.49
     225          61.07         60.44          0.63
     250          60.56         59.93          0.63
     275          59.95         59.63          0.32
     300          59.77         60.10          0.33
     325          59.80         59.88          0.88
     350          59.84         60.14          0.30
     375          59.93         59.85          0.08
     400          59.35         59.98          0.63
     425          59.42         60.00          0.58
     450          59.38         60.23          0.85
     475          59.37         60.43          1.06
     500          59.42         60.66          1.24
     525          59.26         60.57          1.31
     550          59.14         60.74          1.60
     575          59.34         60.62          1.28
     600          59.48         60.75          1.27
     625          59.53         60.56          1.03
     650          59.53         60.26          0.73
     675          59.58         60.28          0.70
     700          59.58         60.31          0.73
     725          59.80         60.30          0.50
     750          59.79         60.49          0.70
     775          59.72         60.38          0.66
     800          59.84         60.67          0.83
     825          59.96         60.52          0.56
     850          59.77         60.62          0.85
     875          59.74         60.61          0.87
     900          59.78         60.47          0.69
     925          59.87         60.56          0.69
     950          59.87         60.46          0.59
     975          59.90         60.38          0.48
   1,000          59.75         60.38          0.63






Table I. consists of two such columns, each headed by an average 
of 25 facts, deduced from the first 50 ages at death, extracted from 
the peerage. Each group of 25 facts becomes 50 in the second line, 
75 in the third line, 100 in the fourth line, each group increasing by 
25 at a time until the last average in each column is based on 1000 
facts. The facts successively added were taken, like the first hundred 
facts, from the peerage, and then from the baronetage, in the exact
order in which they occur. 
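
The construction of Table I. may be illustrated by a short Python sketch. It is only an illustration under stated assumptions: the function names `running_averages` and `compare_series` are hypothetical, and the sample ages are invented for the purpose and are not the facts extracted from the peerage.

```python
from itertools import accumulate

def running_averages(ages, step=25):
    """Average of the first 25, 50, 75, ... ages, kept in order of collection."""
    totals = list(accumulate(ages))
    return [totals[n - 1] / n for n in range(step, len(ages) + 1, step)]

def compare_series(first, second, step=25):
    """Rows of (number of facts, 1st average, 2nd average, difference)."""
    a = running_averages(first, step)
    b = running_averages(second, step)
    return [
        ((i + 1) * step, x, y, abs(x - y))
        for i, (x, y) in enumerate(zip(a, b))
    ]

# Hypothetical ages at death (assumption: not the author's data).
first = [60, 62] * 25
second = [58, 66] * 25
for facts, avg1, avg2, diff in compare_series(first, second):
    print(facts, round(avg1, 2), round(avg2, 2), round(diff, 2))
```

Each series is extended by 25 facts at a time without reordering, so that nothing bordering on selection can enter; the third value in each row corresponds to the column of differences.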

This table will at least serve to show the hopelessness of the 
attempt to discover by the mere accumulation of observations, the 
number of facts which may be necessary to furnish a true average. It 
will be seen that the two columns of figures continue to exhibit as 
wide divergences and as marked fluctuations between the average 
values derived from a large as from a small number of facts; that the 
first difference is below the average of all the differences: and that, 
with a single exception, the averages derived from 75 facts show a 
closer approximation than any other figures in the two columns. The 
superiority of a large to a small number of facts shows itself, however, 
in the steadiness of the results (if we strike off the decimals) in all the 
averages obtained from more than 250 facts in the first column, and 
from upwards of 400 in the second column. It will also be observed 
that the difference between the highest and lowest average in each 
column of figures is very small. In the first column the greatest 
average is 62.44, the least 59.14, and the difference 3.30; in the
second column, the highest average is 61.98, the lowest 59.63, and the
difference 2.35. It would appear, then, if this table were taken as a
guide, that averages drawn from even a small number of facts do not 
lead to those extreme inaccuracies to which they are generally supposed 
to be liable. But, as conclusions based upon a single fact, or a single 
collection of facts, must always be viewed with suspicion, it is very 
desirable to extend our investigation so as to embrace at least a second 
collection of figures of the same order. My inquiries into the duration 
of life among sovereigns have supplied me with the means of effecting 
this on a small scale, in Table II. 

In this table it will be observed that the first average of the first 
column falls short of the first average of the second column by nearly 
five years, and that all the subsequent averages of the two columns 
exhibit a much closer approximation. In like manner, the difference 
between the highest and lowest averages of either column is more 
considerable, being 54.84 - 48.88 = 5.96, in the one case, and
56.91 - 52.92 = 3.99 in the other. Both columns also exhibit a
greater amount of fluctuation than those of Table I. The natural
inference to be drawn from this circumstance is, that the slight difference
between the first averages of the two columns in the first table
was a mere coincidence, and that the more uniform and steady character
of that table is due to such coincidence.

It is obvious, therefore, that tables constructed upon this principle 
would be liable to lead careless reasoners into error, and that a different 
arrangement of the elementary facts is necessary to the right under- 
standing of the relative value of different numbers of facts as data for 
reasoning. 



Table II.

                     Average Age at Death.
No. of Facts.   1st Series.   2nd Series.   Difference.
      25          48.88         53.72          4.84
      50          51.24         52.92          1.68
      75          53.79         54.23          0.44
     100          54.84         55.56          0.72
     125          54.61         56.08          1.47
     150          54.38         56.91          2.53
     175          54.17         56.21          2.04
     200          54.34         55.99          1.65
     225          54.50         56.19          1.69
     250          54.36         56.14          1.78
     275          54.09         55.93          1.84
     300          53.82         55.80          1.98
     325          54.01         55.91          1.90
     350          54.40         55.85          1.45
     375          54.28         56.08          1.80
     400          54.14         56.20          2.00
     425          54.35         55.79          1.44
     450          54.37         55.57          1.20
     475          53.90         55.26          1.36
     500          53.82         55.28          1.36
     525          53.86         55.05          1.19
     550          53.92         54.77          0.85





The following tables, of which the first is founded on the average 
ages, at death, of the peerage and baronetage, and the second, on 
the average ages, at death, of the members of the several ranks and 
professions, as deduced from the "Annual Register," will be found to
comprise most of the elements of an instructive comparison. The 
facts in both instances were taken without selection, in the order 
in which they stood in my original papers. The several facts were 
first arranged in a line in groups of 25 each, two successive groups of 
25 were then formed into groups of 50, the groups of 50 into groups of 
100, and so on, till the last totals in the tables were obtained. 
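
The successive doubling of the groups may be sketched in Python as follows. This is a minimal illustration, assuming only that the facts stand in a single list in their original order; the names `group_averages` and `range_table` are hypothetical. Since two adjoining groups of 25 make up one consecutive group of 50, the sketch simply cuts the list into consecutive groups of each size.

```python
def group_averages(ages, size):
    """Averages of consecutive, non-overlapping groups of `size` ages."""
    usable = len(ages) - len(ages) % size  # drop an incomplete final group
    return [sum(ages[i:i + size]) / size for i in range(0, usable, size)]

def range_table(ages, smallest=25):
    """Rows of (facts per group, number of groups, maximum, minimum, range)."""
    rows = []
    size = smallest
    while size <= len(ages):
        averages = group_averages(ages, size)
        highest, lowest = max(averages), min(averages)
        rows.append((size, len(averages), highest, lowest, highest - lowest))
        size *= 2  # two adjoining groups are thrown into one
    return rows

# Hypothetical ages (assumption: not the author's data).
for row in range_table(list(range(100))):
    print(row)
```

When only one group remains, the maximum and minimum coincide and the range is nothing; the tables that follow give in that line the general average alone.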

Table III.

                               Average Age at Death.
No. of Facts.   No. of Groups.   Maximum.   Minimum.   Range.
       25            64           69.40      50.64     18.76
       50            32           66.44      55.20     11.24
      100            16           63.70      56.85      6.85
      200             8           62.38      57.61      4.77
      400             4           61.10      58.24      2.86
      800             2           60.84      59.67      1.17
    1,600             1                 60.25









VOL. XIII. PART I. 



Table IV.

                               Average Age at Death.
No. of Facts.   No. of Groups.   Maximum.   Minimum.   Range.
       50           128           84.44      56.78     27.66
      100            64           76.24      58.25     17.99
      200            32           73.54      61.50     12.04
      400            16           69.78      63.51      6.27
      800             8           68.67      65.07      3.60
    1,600             4           67.93      64.84      3.09
    3,200             2           66.38      65.82      0.56
    6,400             1                 66.10













Each of these tables exhibits, in a very striking manner, the wide 
difference which may exist between averages deduced from small 
numbers of facts. The first table, founded on facts of a very uniform 
character, namely, the ages, at death, of what may be termed the one 
class of the English aristocracy, exhibits, for averages based on the 
small number of 25 observations, a possible difference between the 
highest and lowest average of nearly 19 years, being considerably 
above one-fourth of the highest average. In the second table, which 
is formed of units differing more widely from each other, inasmuch as 
they represent the ages at death of several classes of the community, 
the difference between the highest and lowest average of 50 observations
is, in round numbers, 28 years, or nearly one-third of the
highest average. If we take the true average in the first table to be 
60 years (the mean of 1600 observations), then the greatest average 
exhibits an error of more than 9 years, and the least average an error 
of about the same amount. If, again, in the second table, we assume 
the true average to be 66 (the average of 6,400 observations), we may 
have an error in excess to the extent of 18 years, and an error in defect
to the extent of 10 years. But when these averages are used for 
purposes of comparison, it is obvious that the possible error may 
amount to the sum of the two errors involved in the two averages 
respectively. Suppose, for example, that we wish to ascertain the 
relative longevity of the members of the English aristocracy and of the 
entire upper and middle class, and we proceed to determine the question 
by averages founded on 50 observations, it might happen that the 
average for the first-named class was the minimum 55.20, and for the
last-named class the maximum 84.44. The difference between these
two numbers is 29.24. But on the assumption that the true averages
are 60.25 and 66.38, the true difference will be only 6.13. So that
the possible error is no less than 29.24 - 6.13, or 23.11.
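
The arithmetic of this example may be checked by the following short Python sketch. The four figures are those quoted in the text; the variable names are hypothetical.

```python
# Figures quoted in the text from Tables III. and IV.
min_aristocracy_50 = 55.20  # lowest average of 50 facts, first table
max_mixed_50 = 84.44        # highest average of 50 facts, second table
true_aristocracy = 60.25    # value the text assumes to be the true average
true_mixed = 66.38          # value the text assumes to be the true average

observed_difference = max_mixed_50 - min_aristocracy_50
true_difference = true_mixed - true_aristocracy
possible_error = observed_difference - true_difference

print(round(observed_difference, 2))  # 29.24
print(round(true_difference, 2))      # 6.13
print(round(possible_error, 2))       # 23.11
```

The possible error of the comparison is thus the sum of the two separate errors, 84.44 - 66.38 and 60.25 - 55.20.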

It will be seen that the limits of possible error diminish rapidly 
with an increase of observations. Thus, if we take the first table, and 
express the range, or difference, in round numbers, the limit of error 
in excess or defect diminishes as half the respective numbers 19, 11, 7, 
5, 3, 1. In the case of the second table, the limits of error will be as 
the half of the several round numbers 28, 18, 12, 6, 4, 3, 1. We look 
in vain for a numerical law of approximation, unless, indeed, the round
numbers 1, 3, 5, 7, in the last four lines of the first table, exhibiting,
as they do, an arithmetical progression with the number 2 as the
common difference, may be regarded as an indication of such a law.
We could only adopt this supposition by the somewhat arbitrary
assumption that the law does not begin to display itself till a rude ap- 
proximation to a true average has been obtained by summing up 100 
observations. The great irregularity of the second table, moreover, 
would seem to forbid this view of the case. 

There is one defect in these tables which may serve to explain the 
absence of any decided approach to regularity in the figures which 
represent the limits of variation; namely, the circumstance that the 
number of the groups diminishes as the number of facts increases. 
For instance, the wide range of 19 years, which appears in the first 
line of the first table, is the difference between the highest and lowest 
averages of no less than 64 groups of 25 facts, while the limited range 
of one year is obtained from only two groups of 800 facts each. If any 
approximation to a numerical law of increase or decrease is to be 
looked for as the result of observation, it is clearly not reasonable to 
expect it except from a comparison of the same number of groups. 
And although it is probable that the chance of the discovery of such 
a law in such a manner would be small, unless the groups were not 
merely equal in number, but also numerous, I have thought that some 
traces of such a law might possibly be discovered in a collection of 
facts in which the groups should be equal, though limited in number. 
The eighth Annual Report of the Registrar-General furnishes the 
materials for a comparison of this kind, by presenting us with the 
number of male and female births for the several counties and regis- 
tration districts of England, during each of the six years 1839-44. 
From this return I have selected, without previous calculation of the 
results, and with an eye solely to the number of the facts, certain 
districts and counties; and, having calculated for each of the six years, 
the number of male births which would have happened in one million 
of births, have given in separate columns the average number of births 
recorded in the year, the greatest and least number of male births in 
a million, and the range or difference between the two extremes. The 
results are embodied in Table V. 
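
The calculation made for each district may be sketched in Python as follows. It is a minimal illustration only: the names `males_per_million` and `yearly_range` are hypothetical, and the yearly returns in the example are invented, not taken from the Registrar-General's report.

```python
def males_per_million(male_births, total_births):
    """Male births that would have happened in one million births."""
    return male_births / total_births * 1_000_000

def yearly_range(returns):
    """`returns` holds one (male births, total births) pair for each year.

    Gives (average yearly births, greatest, least, range), the last three
    being expressed as male births in one million.
    """
    rates = [males_per_million(males, total) for males, total in returns]
    average_births = sum(total for _, total in returns) / len(returns)
    return average_births, max(rates), min(rates), max(rates) - min(rates)

# Hypothetical returns for a small district (assumption: invented figures).
district = [(130, 250), (140, 280), (150, 300)]
births, greatest, least, spread = yearly_range(district)
print(round(births), round(greatest), round(least), round(spread))
```

Reducing every return to the common scale of one million births is what allows districts of very unequal size to be set side by side.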

This table, in common with Tables III. and IV., serves to enforce 
the little dependence to be placed on small numbers of facts, but it is 
still less favourable to the discovery, by observation, of any numerical 
law of approximation. The figures which express the range or differ- 
ence between the highest and the lowest number of male births, pre- 
sent no approach to regularity; and there is every reason to believe 
that the fluctuations are due, not to the variable proportion of male 
and female births, but merely, or chiefly, to the larger or smaller 
number of facts. If we endeavour to arrive at some numerical law by 
throwing the several returns into larger groups, we are equally unfor- 
tunate. If, for instance, we throw the 28 separate returns into four 
groups of seven each, strike an average of each group, and place the 
last return for all England by itself, we obtain, as the differences 
between the highest and lowest number of male births in one million, 
the numbers 3,028, 6,584, 9,812, 24,648, and 59,655, in which it is
quite hopeless to attempt to trace any law of increase or decrease. 




We are equally baffled if, allowing the first and last return to stand by 
themselves, we distribute the remaining 27 returns into three groups 
of 9 each. The five differences in this case are as follows: 3,028, 6,663, 
12,605, 47,163, 107,016.

Table V.

                             Number of
Name of District.              Births.    Maximum.   Minimum.    Range.
Salisbury                         271     543,798    436,782    107,016
Canterbury                        368     556,180    482,500     73,680
Penkridge                         436     554,113    493,333     60,780
Winchester                        615     545,031    492,228     52,803
Rutlandshire                      722     527,246    466,912     60,304
Staines and Uxbridge              858     532,184    498,195     33,998
Northampton                     1,047     524,281    495,274     29,007
St. George in the East          1,515     518,619    480,771     37,848
Alresford and Petersfield       1,612     541,256    500,000     41,256
Huntingdonshire                 2,032     519,824    485,030     34,794
Middlesex (part of)             3,85?     517,260    496,915     20,345
Cambridgeshire                  5,940     522,690    507,188     15,502
Derbyshire                      7,760     517,746    508,166      9,582
Suffolk                        10,011     519,205    505,996     13,209
Gloucestershire                11,894     522,582    497,945     24,637
Kent                           14,256     516,997    506,651     10,346
Staffordshire                  15,771     516,439    507,767      8,672
South Wales                    16,727     516,753    512,006      4,747
Northern Counties              27,916     517,735    511,330      6,405
Eastern Counties               32,212     514,620    509,765      4,855
North Midland Counties         36,451     519,801    510,778      9,023
South Midland Counties         37,764     515,455    507,205      8,250
South-Eastern Counties         44,290     519,268    508,978     10,290
South-Western Counties         52,656     514,210    509,368      4,842
York                           53,825     515,244    510,453      4,791
Metropolis                     59,422     513,812    509,491      4,321
Western Counties               61,903     515,808    507,642      8,166
North-Western Counties         76,721     515,985    510,557      5,428
England                       515,478     514,809    511,781      3,028



Being thus completely baffled in my attempt to discover, by means 
of such observations as were most readily available, a numerical law 
of approximation, expressive of the relative value of averages founded 
upon different numbers of facts, I proceeded to compare the results of 
observation, in the matter of male and female births, with the liability 
to error, as derived from well known and generally received mathe- 
matical formulae, of the several averages founded upon few or many 
facts. 

But before I proceed to institute this comparison between the 
results of observation and the figures derived from the formula of the 
mathematician, I would revert, for a moment, to the tables already 
brought forward, with a view of throwing light upon a question of 
the greatest practical importance, namely, to what extent are we 
justified in employing averages founded upon small numbers of 
observations ? 

One of the leading results just established by the tables, is the 
wideness of the limits which separate the highest and lowest averages 






derived from equal small groups of facts. For instance, the averages 
drawn from 64 groups of 25 ages at death of peers and baronets, 
ranged between a minimum of 51 and a maximum of 69 years, 
leaving a difference of no less than 18 years, the average of 1,600 
observations being about 60 years. In like manner 128 groups, con- 
taining each 50 ages at death, of the mixed upper and middle classes, 
gave a maximum of 84, a minimum of 57, and a range of no less than 
27 years, the average of 6,400 facts being 66 years. The extremes 
approached, and the range contracted rapidly as the numbers of obser- 
vations in each group increased. 

Now this wide divergence of the extreme values derived from 
equal small groups of facts would seem at first sight to be absolutely 
fatal to the use of such small collections of facts for statistical pur- 
poses. But before we adopt this conclusion, it would be well to 
examine the several averages which lie between the two extremes, 
with a view to determine whether they are distributed equally or not ; 
what are the chances that any average taken at hazard will approxi- 
mate to the true mean, or to the extremes ; and, generally, whether 
the chance of encountering, in the majority of instances, a near 
approach to the true value, may not be such as to warrant the em- 
ployment of even small collections of facts, if not as demonstrative 
evidence, at least as valuable probabilities ? 

With a view of throwing some light upon this very important 
question, I have prepared two tables, constructed in the same manner, 
and showing, for the several groups of facts, the precise number of 
averages corresponding to each age lying between the two extremes. 
Each average is referred to the round numbers to which it is nearest. 
The average of all the facts is distinguished by a larger type. 
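
The tabulation just described may be sketched in Python as follows. It is an illustration under stated assumptions: the names `nearest_year` and `tally` are hypothetical, and the rule that an average ending in exactly one-half is referred to the higher number is an assumption, the text not saying how such cases were treated. The sample averages are figures that occur in Table III.

```python
from collections import Counter
import math

def nearest_year(average):
    """Refer an average to the nearest round number (halves go upward)."""
    return math.floor(average + 0.5)

def tally(averages):
    """Number of group averages referred to each whole year of age."""
    return Counter(nearest_year(a) for a in averages)

# A few averages occurring in Table III., used here only as sample input.
counts = tally([60.25, 59.67, 60.84, 58.24, 61.10])
for age in sorted(counts, reverse=True):
    print(age, counts[age])
```

Applied to every group of a given size, the tally yields one column of the tables; the ages are printed from the highest downward, as in the tables themselves.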

Table VI.
Facts taken from the Peerage and Baronetage.

Average   25 Facts,    50 Facts,    100 Facts,   200 Facts,   400 Facts,   800 Facts,
  Age.    72 Groups.   36 Groups.   18 Groups.   9 Groups.    4 Groups.    2 Groups.
  69          1          ....         ....         ....         ....         ....
  68        ....         ....         ....         ....         ....         ....
  67        ....         ....         ....         ....         ....         ....
  66          3            1          ....         ....         ....         ....
  65          4          ....         ....         ....         ....         ....
  64          9            2            1          ....         ....         ....
  63          3            3            1          ....         ....         ....
  62          3            5            3            3          ....         ....
  61          9            6            3          ....           3            1
  60          9            4            3            2          ....           1
  59          8            5            3            3          ....         ....
  58          6            5            1            1            1          ....
  57          5            1            3          ....         ....         ....
  56          6            3          ....         ....         ....         ....
  55          1            1          ....         ....         ....         ....
  54          3          ....         ....         ....         ....         ....
  53          1          ....         ....         ....         ....         ....
  52        ....         ....         ....         ....         ....         ....
  51          1          ....         ....         ....         ....         ....





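Table VII.
Facts taken from the Annual Register.

Average   50 Facts,    100 Facts,   200 Facts,   400 Facts,   800 Facts,   1,600 Facts,  3,200 Facts,
  Age.    128 Groups.  64 Groups.   32 Groups.   16 Groups.   8 Groups.    4 Groups.     2 Groups.
  84          1          ....         ....         ....         ....         ....          ....
  83        ....         ....         ....         ....         ....         ....          ....
  82        ....         ....         ....         ....         ....         ....          ....
  81        ....         ....         ....         ....         ....         ....          ....
  80        ....         ....         ....         ....         ....         ....          ....
  79        ....         ....         ....         ....         ....         ....          ....
  78        ....         ....         ....         ....         ....         ....          ....
  77        ....         ....         ....         ....         ....         ....          ....
  76          1            1          ....         ....         ....         ....          ....
  75          1          ....         ....         ....         ....         ....          ....
  74          4            1            1          ....         ....         ....          ....
  73          2            1          ....         ....         ....         ....          ....
  72          2            1          ....         ....         ....         ....          ....
  71          5          ....         ....         ....         ....         ....          ....
  70          4            4          ....           1          ....         ....          ....
  69          7            3            2          ....           1          ....          ....
  68         13            5            3            1          ....           1           ....
  67         17            9            4            4            1          ....          ....
  66         20           13           14            4            2            1             2
  65          5           13            2            4            3            2           ....
  64         16            6            2            2            1          ....          ....
  63         11            1            2          ....         ....         ....          ....
  62         10            4            1          ....         ....         ....          ....
  61          3            1            1          ....         ....         ....          ....
  60          3          ....         ....         ....         ....         ....          ....
  59        ....         ....         ....         ....         ....         ....          ....
  58        ....           1          ....         ....         ....         ....          ....
  57          3          ....         ....         ....         ....         ....          ....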







These tables supply a ready answer to the questions just propounded:
they show at a glance, what a priori reasoning would
suggest, that the individual averages are few in number in the direc- 
tion of the extremes, and numerous as we approach the true average. 
In the first table, for instance, in no less than 9 cases out of 72, or 
one-eighth of the whole number, the averages of 25 facts are the same 
as the true average, namely 60; while, in the second table, 20 out of
128 groups of 50 facts, or little less than one-sixth of the whole
number, yield the true average, namely, 66. The remaining columns
of the two tables present similar results ; but those of the second table 
are the most striking. Thus, of the 64 groups of 100 facts, 13, or 
about one-fifth, of the 32 groups of 200 facts, 14, or nearly one-half, 
of the 16 groups of 400 facts, 4, or one-fourth, and of the groups of 
800 and 1600 facts, the like proportion are found to coincide with the 
true average derived from all the facts. 

If, again, we take the instances in which the averages exceed, and 
fall short of the true average by only one year (which may be regarded 
as a very near approach to accuracy), we find that, in the first table, 
26 out of 72 groups of 25 facts, 15 out of 36 groups of 50 facts, 9 out 
of 18 groups of 100 facts, 5 out of 9 groups of 200 facts, and 3 out of 
4 groups of 400 facts, or upwards of one-third, little less than one-half,
exactly one-half, more than one-half, and three-fourths respectively,
answer to this description. So also in the second table: 42 in
128 groups of 50, 35 in 64 groups of 100, 20 in 32 groups of 200, 12 
in 16 groups of 400, 6 in 8 groups of 800, and 3 out of 4 groups of 
1600, correspond with the average of all the facts. There is, therefore, 
no room for doubt that the chances in favour of an average, approaching 
very closely to the true average, greatly overbalance those in favour of 
an average approaching either extreme, even when the number of facts 
from which such average is calculated is inconsiderable. 
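
The proportions just recited may be verified by the following Python sketch. The counts are those quoted in the text for averages lying within one year of the true average; the names `first_table`, `second_table`, and `shares` are hypothetical.

```python
from fractions import Fraction

# facts per group: (groups within one year of the true average, all groups)
first_table = {25: (26, 72), 50: (15, 36), 100: (9, 18), 200: (5, 9), 400: (3, 4)}
second_table = {50: (42, 128), 100: (35, 64), 200: (20, 32),
                400: (12, 16), 800: (6, 8), 1600: (3, 4)}

def shares(table):
    """Fraction of the groups lying within one year of the true average."""
    return {facts: Fraction(near, total) for facts, (near, total) in table.items()}

for name, table in (("first", first_table), ("second", second_table)):
    for facts, share in shares(table).items():
        print(name, facts, share, round(float(share), 3))
```

In both tables the share of near approaches to the true average grows as the groups are enlarged, rising from about one-third to three-fourths.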

The progressive and steady increase or decrease in series of averages 
of 25 or 50 facts (as in my observations on the frequency of the pulse 
in the two sexes at different ages), may be adduced in confirmation of 
the result of the foregoing tables, and as a sufficient reason for not 
rejecting conclusions based upon a small number of observations*. 

The materials of the foregoing tables are the ages at death of 
members of the aristocracy, and of the combined upper and middle 
classes. I now propose to extend my inquiry, so as to embrace a 
different order of facts belonging to that class of investigations in 
which two alternative events, such as death or recovery from disease, 
the birth of a male or female child, &c., are in question. In order
that the element of more or fewer facts might have full play, and be 
subject to the least possible disturbance, I have chosen a very simple 
alternative event, namely, the attendance of a male or female patient 
among the out-patients of an hospital. The facts have been carefully 
extracted from the physicians' out-patient book of the King's College 
hospital, in which are entered the names of all patients of both sexes 
above ten years, with exceptions which it is not necessary to specify. 
The male and female attendances were extracted from the books by 
fifties, in the exact order in which they were entered. The number of 
males in each fifty patients was then written down in a vertical line, 
two adjoining fifties were bracketed together to form a group of 100, 
two adjoining hundreds to form a group of 200, and so on till the 
grand total of 6,400 was obtained. 

I shall assume that the average of these 6,400 facts forms the true
average, and as such, shall use it for a standard of comparison. 

The most superficial examination of my tables convinced me that 
the fluctuations in this new order of facts were not less than those 
which I had encountered in the averages based upon the ages at death 
of different classes of the community. 

Table VIII., which is a counterpart of Tables III. and IV., ex- 
hibits the per centage attendances of males, in maxima and minima, as 
obtained from groups of 50, 100, 200, 400, 800, 1,600, and 3,200 facts. 

This table exhibits the same rapid but irregular approximations 
of the extreme values which characterize Tables III. and IV., and 
the same absence of any obvious numerical law of approximation. 
The per centage attendance of males which for 50 facts and 128
averages has a range equal to the average of all the groups, namely, a
range of 40, exhibits for the two averages of 3,200 facts a range of less
than one-fifth of an unit. Even the four groups of 1,600 facts display
a range of less than one and a-half: the rate of approximation to a
true average is, therefore, very rapid.

* In the case of the female pulse, averages of 25 facts showed a regular and
progressive decrease for every period of 7 years from birth to the 56th year, and a
progressive increase from that age to the end of life. Averages of the same number
of facts exhibited for the male pulse a similar but less steady decrease and increase.

Table VIII.

                           Attendance of Males in 100 Attendances.
Number of    Number of
  Facts.      Groups.      Maximum.        Minimum.        Range.
     50         128          64              24              40
    100          64          53              29              24
    200          32          46              34[?]           11[?]
    400          16          43[?]           36              [illegible]
    800           8          40[?]           38              3[?]
  1,600           4          [illegible]     39[?]           [illegible]
  3,200           2          [illegible]     [illegible]     [illegible]
  6,400                             [illegible]



The probability of encountering a true average, or one equal to 
the average of all the facts, will also appear to be similar to that dis- 
played in Tables VI. and VII. There is the same tendency in the 
averages to arrange themselves about the true mean, rather than in 
the neighbourhood of the extremes. This will fully appear in Table 
IX., in which, in consequence of the necessity of reducing all the 
averages to per centage proportions, the numbers in the column of 50 
facts, being the figures extracted from the hospital books multiplied by 
2, will be found opposite to even numbers. The figures in the other 
columns are the round numbers nearest to the quotients of the totals 
divided by 2, 4, 8, 16, and 32, respectively. 
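The reduction to per centage proportions described above may be sketched as follows. The male counts used are hypothetical, and the helper names `percent_males` and `tally` are assumptions made for illustration; the only point shown is that a count out of 50, being multiplied by 2, must fall on an even number, while larger totals are divided and rounded.

```python
from collections import Counter


def percent_males(males, facts):
    """Males per 100 attendances, rounded to the nearest whole number."""
    return round(100 * males / facts)


def tally(male_counts, facts):
    """Frequency of each whole-number percentage among equal-sized groups."""
    return Counter(percent_males(m, facts) for m in male_counts)


# Hypothetical male counts in five groups of 50 attendances each.
example = tally([20, 21, 20, 19, 32], 50)
print(sorted(example.items()))

# A hypothetical total of 163 males in 400 attendances: 163 / 4 = 40.75.
print(percent_males(163, 400))  # 41
```

Python's `round` sends exact halves to the nearest even number, which may not be the convention followed in the printed table.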

Having thus reproduced for a new order of facts the tables founded 
upon the ages at death of the aristocracy, and of the combined upper 
and middle classes, with a view of confirming the results arrived at by 
the use of these data, I proceed to institute the promised comparison 
between the results of actual observation and the calculations of the 
mathematician. 

The materials which I have selected for this purpose are the 
numbers of male and female births which took place in several counties 
and registration districts of England and Wales during each of the six 
years from 1839 to 1844 inclusive. The comparison in question is embo- 
died in Table X., of which the first column consists of numerals referring 
to the counties and registration districts specified in the foot-note; the 
second column, of the number of births registered in the several coun- 
ties or districts on an average of the six years ; the third column, of 
the mean number of male births in one million occurring in the six 
years ; the fourth column, of the greatest and least number of male 
births in the six years, with the range, or difference between them ; 
the fifth column, of the greatest and least number of male births 
which would have taken place in the same six years, on the supposi- 
tion that the male and female births are really equal in number, and 
that the observed difference between the male and female births is an 
error of observation ; the sixth column, of the error in excess or defect 
attaching to the number of facts upon that supposition ; the seventh
column, of the greatest and least number of male births in one million,
which would result from the addition and subtraction of the
error due to the number of facts to and from the average male births 
in one million in the same six years ; and the eighth and last column, 
of the error in excess or defect to which the respective numbers of facts 
are liable. The formulae by which the figures in the last four columns 
were calculated are given in the foot-notes. 

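Table IX.

Average No.
of attendances   50      100     200     400     800     1,600   3,200
of Males.        Facts.  Facts.  Facts.  Facts.  Facts.  Facts.  Facts.

64               1       ....    ....    ....    ....    ....    ....
63               ....    ....    ....    ....    ....    ....    ....
62               ....    ....    ....    ....    ....    ....    ....
61               ....    ....    ....    ....    ....    ....    ....
60               1       ....    ....    ....    ....    ....    ....
59               ....    ....    ....    ....    ....    ....    ....
58               ....    ....    ....    ....    ....    ....    ....
57               ....    ....    ....    ....    ....    ....    ....
56               2       ....    ....    ....    ....    ....    ....
55               ....    ....    ....    ....    ....    ....    ....
54               1       ....    ....    ....    ....    ....    ....
53               ....    1       ....    ....    ....    ....    ....
52               3       ....    ....    ....    ....    ....    ....
51               ....    ....    ....    ....    ....    ....    ....
50               ....    ....    ....    ....    ....    ....    ....
49               ....    1       ....    ....    ....    ....    ....
48               8       2       ....    ....    ....    ....    ....
47               ....    ....    ....    ....    ....    ....    ....
46               11      1       1       ....    ....    ....    ....
45               ....    2       2       ....    ....    ....    ....
44               13      7       1       ....    ....    ....    ....
43               ....    5       2       1       ....    ....    ....
42               15      6       1       1       ....    ....    ....
41               ....    5       5       4       2       ....    ....
40               15      5       3       5       3       3       2
39               ....    4       3       1       2       1       ....
38               14      5       7       2       1       ....    ....
37               ....    5       3       1       ....    ....    ....
36               8       6       2       1       ....    ....    ....
35               ....    1       1       ....    ....    ....    ....
34               14      3       1       ....    ....    ....    ....
33               ....    2       ....    ....    ....    ....    ....
32               9       2       ....    ....    ....    ....    ....
31               ....    ....    ....    ....    ....    ....    ....
30               5       ....    ....    ....    ....    ....    ....
29               ....    1       ....    ....    ....    ....    ....
28               5       ....    ....    ....    ....    ....    ....
27               ....    ....    ....    ....    ....    ....    ....
26               1       ....    ....    ....    ....    ....    ....
25               ....    ....    ....    ....    ....    ....    ....
24               2       ....    ....    ....    ....    ....    ....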

The results of the comparison instituted in this table are interesting 
and instructive. On the supposition that the actual equality of male 
and female births in one district (No. 21,) in one of the six years, and 
the near approach to equality in other instances, had left us in doubt
whether the inequality in other years and districts might not be ex- 
plained by the errors attaching to a comparatively small number of 
observations, the mathematician would direct us to compare columns 
4 and 5, with a view to a solution of the difficulty. The figures in the 
fifth column are calculated upon the assumption, that the male and 
female births are naturally equal; but that in consequence of the 
limited number of facts embraced by the several returns, that equality 
is destroyed, and the proportion of male to female births is subject to 
great apparent fluctuation. If, then, this assumption of equality be 
false, the maxima of male births in the several districts ought uniformly 
to exceed the maxima obtained by calculation. In other words, in 
order to disprove the theory of equality, the maximum established by 
observation ought, in each return, to surpass the limits of possible error 
due to the numerical insufficiency of the facts. 

But a reference to the table will show that it is only in nineteen out 
of twenty-nine returns that the highest number of male births obtained 
in each of the six years, 1839-44, does exceed the maximum obtained 
by calculation. In the rest of the returns, being ten in number, the use 
of the calculations in column 5 would at least leave us in doubt, as the 
observed maxima are less than the maxima derived from calculation. 
In these ten cases, therefore, though the returns display an apparent 
excess of male births, that excess falling short of the maxima derived 
from calculation, we should remain in doubt whether the observed excess 
of male births was in the nature of things, or merely an error of ob- 
servation. To make this subject more easy of comprehension, I will 
take two cases in point: the returns for all England, marked 1, and
the returns for Huntingdonshire, marked 20. On referring to column 
5, it will be seen that, on the supposition of a perfect equality between 
the male and female births, 515,478 recorded births might yield any 
number between 500,624 and 499,376 male births in a million. 
Now, every one of the returns for the six years, 1839-44, greatly ex- 
ceeds the maximum 500,624; it may, therefore, be fairly assumed that 
the male and female births are not equal in number, but that, in the 
nature of things, there is an excess of male births. On the other hand, 
it appears by column 5, that, on the same supposition of a perfect 
equality of male and female births, we might encounter in 2,032 births 
recorded in the county of Huntingdon any number of male births in a 
million between 531,369 and 468,631. But every one of the six re- 
turns from Huntingdonshire falls short of the maximum 531,369. 
Hence we may infer that at least there is the greatest possible doubt 
whether the male and female births in that county may not be equal. 
Now, there is good reason to believe that in England, at least, the rule 
of an excess of male births is subject to no exception. It obtains uni- 
formly in the first fourteen districts of the table, from which the 
returns are most numerous ; it fails only where the facts are compara- 
tively few in number; and it is obviously in the highest degree im- 
probable that out of twenty-nine returns taken without selection from 
several hundreds, the first fourteen should be governed by one rule, 
and the last fifteen by another. From this table, then, it may be 
fairly inferred that the formulae of the mathematician are not applicable 
as tests to the results of observations founded on comparatively small 
numbers of facts. 
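The test described in the preceding paragraphs may be put in the form of a short sketch. It assumes that Formula I. is to be read as the square root of 2 divided by the number of recorded births, which agrees closely, though not to the last figure, with the limits printed for the smaller returns; the function names are hypothetical. The two returns used are Kent (No. 14) and Huntingdonshire (No. 20), with the births and observed maxima as printed in Table X.

```python
import math


def formula_one_limits(births):
    """Greatest and least male births per million on the supposition of
    equal chances. Assumes the footnote formula reads sqrt(2 / births)."""
    error = math.sqrt(2.0 / births) * 1_000_000
    return 500_000 + error, 500_000 - error


def excess_is_established(observed_max, births):
    """True when the observed maximum surpasses the calculated maximum."""
    upper, _ = formula_one_limits(births)
    return observed_max > upper


# (recorded births, highest yearly male births per million), from Table X.
kent = (14_256, 516_997)
huntingdonshire = (2_032, 519_824)

print(excess_is_established(kent[1], kent[0]))                        # True
print(excess_is_established(huntingdonshire[1], huntingdonshire[0]))  # False
```

On this reading the Kent returns pass the test and the Huntingdonshire returns fail it, which is the distinction the text draws between the larger and the smaller returns.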




A comparison of the actual maxima and minima established by
observation in the six years, 1839-44, with the maxima and minima
calculated by the formula $\frac{m}{\mu} \pm 2\sqrt{\frac{2mn}{\mu^{3}}}$ shows, as might be
anticipated, a greater divergence between the extremes derived from
calculation than between those established by observation within so limited a
period. It is only in the single case of all England, where the number 
of facts exceeds half a million, that the extremes obtained by calcula- 
tion fall within those established by observation. 
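The relation between the two formulae used in Table X. may be shown in a line or two. The working below assumes that Formula II. reads $\frac{m}{\mu} \pm 2\sqrt{\frac{2mn}{\mu^{3}}}$ and Formula I. reads $\sqrt{2/\mu}$; both readings are reconstructions from a damaged copy, checked only against the figures printed in the table. On the supposition of equal chances, m = n = μ/2, and the limit of error becomes:

```latex
\[
2\sqrt{\frac{2mn}{\mu^{3}}}\,\Bigg|_{m=n=\mu/2}
  = 2\sqrt{\frac{2\cdot\frac{\mu}{2}\cdot\frac{\mu}{2}}{\mu^{3}}}
  = 2\sqrt{\frac{1}{2\mu}}
  = \sqrt{\frac{2}{\mu}} ,
\qquad\text{so that the limits are}\qquad
\frac{1}{2} \pm \sqrt{\frac{2}{\mu}} .
\]
```

For Huntingdonshire, with μ = 2,032, this gives a limit of about 0.0314, in close agreement with the 031,369 printed in the table.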

On reviewing the tables contained in this communication to the 
Society, the following propositions may be put forward as fully 
warranted by them : 

1. That the range, or difference between the greatest and least aver- 
age derived from successive groups of small numbers of facts is very con- 
siderable, but that it diminishes rapidly as the number of facts increases. 

2. That the rate of approximation of the extreme values varies 
with each different order of facts, and that it does not appear to be 
amenable to any numerical law. 

3. That the greater the number of the elements which determine
the occurrence of the events that, when thrown into groups, constitute 
the materials of our averages, the greater should be the number of our 
facts; e. g. the average duration of life of the entire middle class 
ought to be deduced from a larger number of facts than the average 
duration of life of a single profession. 

4. That though the possible error to which a given small number 
of facts is liable is very large, there is always a fair probability in favour 
of any particular average coinciding with, or approaching very closely 
to, the true average. 

5. That the formulae of the mathematician have a very limited 
application to the results of observation; and that if incautiously 
applied, they may lead to very grave errors. 

6. That though averages derived from large numbers of facts are 
worthy of much greater confidence than those founded upon small 
numbers of facts, the latter class of averages are by no means to be 
rejected as useless, but should be employed as probabilities of greater 
or less value, as the number of facts is larger or smaller. 

These propositions lay no claim to originality. They are merely 
intended to express the conclusions legitimately to be drawn from the 
facts contained in this essay. 

Names of the Places from which the Returns in Table X. are taken. 
(The numbers are those in column I. of the Table.) 

1. England. 2. North-Western Counties. 3. Western Counties.
4. Metropolis. 5. York. 6. South-Western Counties. 7. South-Eastern
Counties. 8. South Midland Counties. 9. North Midland
Counties. 10. Eastern Counties. 11. Northern Counties. 12. South
Wales. 13. Staffordshire. 14. Kent. 15. Gloucestershire. 16. Suffolk.
17. Derbyshire. 18. Cambridgeshire. 19. Middlesex (part of).
20. Huntingdonshire. 21. Alresford and Petersfield. 22. St. George's
in the East, London. 23. Northampton. 24. Staines and Uxbridge.
25. Rutlandshire. 26. Winchester. 27. Penkridge. 28. Canterbury.
29. Salisbury.






Table X.

Total Births in Seven Years in England, 3,636,383. Male Births in 1,000,000, 512,504.

Columns: No.; Number of Recorded Births (Average of Six Years); Males Born
in 1,000,000 Births, by Observation (Average of Six Years, and Maximum,
Minimum, and Range in Six Years, 1839-44); the same by Calculation
(Formula I.)*, with the Limit of Error, in excess or defect; the same by
Calculation (Formula II.)†, with the Limit of Error, in excess or defect.

No.  Recorded  Observed  Observed  Observed  Observed  Form. I.  Form. I.  Form. I.  Form. I.  Form. II. Form. II. Form. II. Form. II.
     Births.   Average.  Maximum.  Minimum.  Range.    Maximum.  Minimum.  Range.    Limit.    Maximum.  Minimum.  Range.    Limit.

1    515,478   512,904   514,809   511,781   3,028     500,624   499,376   1,248     000,624   513,528   512,280   1,248     000,624
2    76,721    513,242   515,985   510,557   5,428     505,099   494,901   10,198    005,100   518,242   508,142   10,200    005,100
3    61,903    512,643   515,808   507,642   8,166     505,657   494,343   11,314    005,657   518,300   506,986   11,314    005,657
4    59,422    511,489   513,812   509,491   4,321     505,831   494,169   11,662    005,831   517,320   505,658   11,662    005,831
5    53,825    512,375   515,244   510,453   4,791     506,083   493,917   12,166    006,083   518,458   506,292   12,166    006,083
6    52,656    511,825   514,210   509,368   4,842     506,164   493,836   12,328    006,164   517,989   505,661   12,328    006,164
7    44,290    512,085   519,268   508,978   10,290    506,708   493,292   13,416    006,708   518,793   505,377   13,416    006,708
8    37,764    510,472   515,455   507,205   8,250     507,280   492,720   14,560    007,280   517,752   503,192   14,560    007,280
9    36,451    513,910   519,801   510,778   9,023     507,416   492,584   14,832    007,416   521,326   506,494   14,832    007,416
10   32,212    512,186   514,626   509,765   4,855     507,874   492,126   15,748    007,874   520,060   504,312   15,748    007,874
11   27,916    514,896   517,735   511,330   6,405     508,485   491,515   16,970    008,485   523,381   506,411   16,970    008,485
12   16,727    514,608   516,753   512,006   4,747     510,954   489,046   21,908    010,954   525,562   503,654   21,908    010,954
13   15,771    512,181   516,439   507,767   8,672     511,269   488,731   22,538    011,269   523,450   500,912   22,538    011,269
14   14,256    511,478   516,997   506,651   10,346    511,832   488,168   23,664    011,832   523,310   499,646   23,664    011,832
15   11,894    512,815   522,582   497,945   24,637    512,961   487,039   25,922    012,961   525,776   499,854   25,922    012,961
16   10,011    512,330   519,205   505,996   13,209    514,107   485,893   28,214    014,107   526,437   498,223   28,214    014,107
17   7,760     512,334   517,746   508,166   9,582     516,062   483,938   32,124    016,062   528,396   496,272   32,124    016,062
18   5,940     514,275   522,690   507,188   15,502    518,330   481,670   36,660    018,330   530,664   494,004   36,660    018,330
19   3,854     508,788   517,260   496,915   20,345    522,781   477,219   45,562    022,781   531,569   480,007   45,562    022,781
20   2,032     503,272   519,824   485,030   34,794    531,369   468,631   62,738    031,369   534,641   471,903   62,738    031,369
21   1,612     515,587   541,256   500,000   41,256    535,228   464,772   70,456    035,228   550,786   480,388   70,398    035,199
22   1,515     506,165   518,619   480,771   37,848    536,332   463,668   72,664    036,332   542,497   469,833   72,664    036,332
23   1,047     506,193   524,281   495,274   29,007    543,715   456,285   87,430    043,715   549,897   462,489   87,408    043,704
24   858       517,578   532,184   498,195   33,998    548,280   451,720   96,560    048,280   565,827   469,329   96,498    048,249
25   722       504,491   527,246   466,942   60,304    552,631   447,369   105,262   052,631   557,122   451,860   105,262   052,631
26   615       516,449   545,031   492,228   52,803    557,026   442,974   114,052   057,026   573,440   459,458   113,982   056,991
27   436       515,557   554,113   493,333   60,780    567,727   432,273   135,454   067,727   583,211   447,903   135,308   067,654
28   368       519,599   556,180   482,500   73,680    573,722   426,278   147,444   073,722   593,267   445,931   147,336   073,688
29   271       509,833   543,798   436,782   107,016   585,907   414,093   171,814   085,907   595,711   423,955   171,756   085,878


* Formula I. is based on the assumption that the chances of two events (in this
case, a male or female birth) are equal. The formula is $\sqrt{2/\mu}$, where $\mu$ = the
total average births for one year, as given in the second column of the Table.

† Formula II. is the well-known formula for calculating the limits of error to
which any number of observations on two alternative events are liable. This
formula is $\frac{m}{\mu} \pm 2\sqrt{\frac{2mn}{\mu^{3}}}$, where m = the total of one of the two events m and n,
and $\mu$ their sum, or m + n.
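The limits given by Formula II. may be worked out as in the sketch below. It assumes the formula reads m/μ ± 2√(2mn/μ³), a reconstruction that agrees closely, though not always to the last figure, with the limits printed in Table X. The function name and the split of 831 male to 781 female births are hypothetical, the split being chosen only to come near the proportion printed for district No. 21 (1,612 births, 515,587 males in a million).

```python
import math


def formula_two_limits(m, n):
    """Return (upper, lower, error) as proportions for two alternative events.
    Assumes Formula II. reads m/mu +/- 2*sqrt(2*m*n/mu**3), with mu = m + n."""
    mu = m + n
    error = 2.0 * math.sqrt(2.0 * m * n / mu ** 3)
    proportion = m / mu
    return proportion + error, proportion - error, error


# Hypothetical split of 1,612 births (an assumption, not a printed figure).
m, n = 831, 781
upper, lower, error = formula_two_limits(m, n)

print(f"greatest  {upper * 1_000_000:,.0f} in a million")
print(f"least     {lower * 1_000_000:,.0f} in a million")
print(f"limit of error  {error * 1_000_000:,.0f} in a million")
```

When m and n are equal the error returned is the same as that of Formula I., so a single function of this kind would serve for both calculated columns of the table.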



